 Connecticut


Asymmetric Scaling Laws from Sparse Features

arXiv.org Machine Learning

We introduce a model for neural scaling laws under sparse activations. In the model, test loss is often dominated by rare coordinates that are never observed in the training input. This mechanism induces a novel bottleneck absent from dense models. We derive the asymptotic population loss in both the underparameterized and overparameterized regimes, and show that the loss exhibits a double-descent peak near the interpolation threshold, where the number of parameters is just sufficient to fit the training data. The resulting loss curve is governed by two distinct scaling exponents, one for the overparameterized regime and one for the underparameterized regime, with a gap determined by the degree of sparsity. Additionally, we derive a compute-optimal frontier that favors increasing dataset size over model capacity under fixed compute budgets. We also analyze gradient-descent dynamics and identify a scaling law for the probability that fixed-step gradient descent becomes unstable. We further show that the sparsity-induced effect persists under nonlinear activations.


Forecasting Medium-Horizon Alzheimer's Disease Progression: Residual Gap-Aware Transformers for 24-Month CDR-SB Change from ADNI Clinical and Biomarker Histories

arXiv.org Machine Learning

Medium-horizon Alzheimer's disease progression prediction is difficult because future clinical scores can remain tied to baseline severity, while biomarker histories are irregular and incompletely observed. We develop an anchor-based analysis of 24-month Clinical Dementia Rating Sum of Boxes (CDR-SB) change using harmonized Alzheimer's Disease Neuroimaging Initiative (ADNI) tables. Each labeled sample is anchored at a mild cognitive impairment visit, uses only clinical and biomarker history observed at or before that anchor, and defines the response as CDR-SB at the future visit closest to 24 months within an 18--30 month window minus anchor CDR-SB. The analytic cohort contains 2,600 labeled anchors from 858 participants and 7,276 longitudinal rows. We propose a residual gap-aware transformer that combines a mixed-effects statistical reference with transformer-based residual learning from pre-anchor clinical and biomarker histories. The model uses participant-level random intercepts in the mixed-effects reference, observation-level triplet tokenization for irregular histories, and a learned nonnegative time-gap penalty inside self-attention. We compare the proposed model with a Bayesian-information-criterion-selected linear mixed-effects baseline, GRU-D, and STraTS under repeated participant-level train--test splits. Across five participant-level random seeds, the proposed model achieves the best mean test performance across all reported metrics, reducing MSE by 13.1% and increasing prediction--observation correlation by 26.4% relative to the mixed-effects baseline. It also improves over both GRU-D and STraTS in mean error and correlation. These results show that statistical anchoring and gap-aware residual learning provide a useful structure for medium-horizon Alzheimer's disease progression prediction.
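The response definition in the abstract above (CDR-SB at the future visit closest to 24 months, restricted to an 18--30 month window, minus anchor CDR-SB) can be illustrated with a minimal sketch. This is not the authors' code: the function name `cdrsb_change_24m`, the `(month, score)` visit layout, the inclusive window bounds, and the tie-breaking rule (prefer the earlier visit) are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def cdrsb_change_24m(
    anchor_month: float,
    anchor_cdrsb: float,
    visits: Sequence[Tuple[float, float]],
    target: float = 24.0,
    window: Tuple[float, float] = (18.0, 30.0),
) -> Optional[float]:
    """Return the CDR-SB change for one anchor, or None if no visit qualifies.

    visits holds hypothetical (month, cdrsb) pairs on the same time axis
    as anchor_month. Window bounds are treated as inclusive (an assumption).
    """
    lo, hi = window
    candidates = []
    for month, score in visits:
        offset = month - anchor_month  # months elapsed since the anchor visit
        if lo <= offset <= hi:
            # Sort key: distance to the 24-month target, then visit month,
            # so ties go to the earlier visit (assumed rule).
            candidates.append((abs(offset - target), month, score))
    if not candidates:
        return None  # no labeled sample for this anchor
    _, _, future_score = min(candidates)
    return future_score - anchor_cdrsb


if __name__ == "__main__":
    history = [(12, 1.5), (20, 2.0), (25, 3.0), (36, 4.0)]
    print(cdrsb_change_24m(0.0, 1.0, history))
```

An anchor with no visit inside the window yields `None`, which corresponds to an anchor that would not enter the labeled cohort under this reading of the definition.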


License plate cameras at Home Depot and Lowe's spark privacy fears

FOX News

Home Depot and Lowe's stores in Connecticut are reportedly using automated license plate readers in parking lots to prevent theft, raising privacy concerns about data access.


Spectral Lens: Activation and Gradient Spectra as Diagnostics of LLM Optimization

arXiv.org Machine Learning

Training loss and throughput can hide distinct internal representations in language-model training. To examine these hidden mechanics, we use spectral measurements as practical and operational diagnostics. Using a controlled family of decoder-only models adapted from the modded NanoGPT codebase, we introduce an empirical protocol based on activation covariance and per-sample gradient SVD spectra. This dual view yields three empirical findings and one mechanistic explanation. First, batch size acts as a latent determinant of representation geometry: runs that reach equal loss settle into systematically distinct activation spectra. Second, the activation covariance tail measured early in training reliably forecasts downstream token efficiency. Third, movement of the activation spectrum head (leading modes), together with gradient spectra, characterizes underlying learning-dynamics changes, separating learning-side architectural improvements from primarily execution-side gains. These predictive and diagnostic signals persist across the 12-, 36-, and 48-layer model tiers. Finally, a mechanistic model proves the main observations and explains how activation covariance spectra correlate with task-aligned feature learning.


In 1934, Chrysler bet big on teardrop-shaped cars

Popular Science

The streamline shape is still more aerodynamic than most cars today. In 1930, English engineer Sir Dennis Burney told Popular Science that his teardrop-shaped car would cut fuel consumption in half. From the start, cars were built wrong. At least, that's what Chrysler's head of automotive research, Carl Breer, thought in 1930. Automobiles had never been built to be aerodynamic, he posited, and he was right.


Anchored Variational Inference for Personalized Sequential Latent-State Models

arXiv.org Machine Learning

Sequential latent-variable models with subject-specific random effects provide a flexible framework for modeling temporally structured data with both local latent dynamics and stable between-subject heterogeneity. In such models, conditional inference for the local latent process is often tractable, but integrating over subject-specific random effects can be computationally demanding. We propose an anchored variational inference framework for efficient approximate inference in this setting. The central idea is to replace the full conditional posterior of the local latent process with its evaluation at a representative value of the subject-specific latent effect, called the anchor point, thereby preserving tractable local inference while substantially reducing computational cost. This approximation is especially appealing in sequential settings, where the posterior distribution of the random effect becomes increasingly concentrated as the sequence length grows. Under suitable conditions, we show that the posterior mean is a nearly optimal anchor point and that the resulting anchored variational EM (AVEM) algorithm approximately preserves the local monotonicity behavior of standard variational inference. We instantiate the framework in two representative classes of sequential latent-variable models, namely mixed hidden Markov models and mixed-effects state-space models, derive the corresponding AVEM algorithms, and use simulation studies to indicate that the resulting methods achieve accurate estimation with substantial computational gains. We also discuss a partially anchored variant of the framework, in which only the components of the subject-specific latent effect whose posteriors are well concentrated are anchored.


1 in 50 million split-colored lobster found in Massachusetts

Popular Science

The three-pound crustacean will live at an aquarium, offering a fun genetics lesson. The exciting discovery offers a lesson in genetics. A two-toned lobster is set to make a splash at the Woods Hole Science Aquarium in southeastern Massachusetts.


People keep trespassing near cave filled with bats infected by Ebola's cousin

Popular Science

The Marburg virus disease can reach a nearly 90 percent mortality rate. Epidemiologists believe the Marburg virus disease is primarily transmitted to humans through Egyptian fruit bats. You do not want to contract Marburg virus disease (MVD).


How to tell eagle parents Jackie and Shadow apart

Popular Science

Jackie looks stern, while Shadow looks a bit more surprised. Jackie and Shadow have been nesting together since 2018. The two new eaglets eating, chirping, and "bopping" in their nest high above Southern California's Big Bear Lake are arguably the stars of the popular wildlife livestream.


Earth's largest otters have chocolate bar-sized babies

Popular Science

Chester Zoo celebrates the birth of giant otter triplets. While they only weigh 7.1 ounces as babies, giant otters can grow to six feet long and weigh up to 71 pounds. It turns out that giant otter newborns are actually quite small, weighing just around 7.1 ounces.